AITopics | clinical workflow

EndoBench: A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis

Neural Information Processing Systems, Jun-9-2026, 15:47:23 GMT

Endoscopic procedures are essential for diagnosing and treating internal diseases, and multi-modal large language models (MLLMs) are increasingly applied to assist in endoscopy analysis. However, current benchmarks are limited, as they typically cover specific endoscopic scenarios and a small set of clinical tasks, failing to capture the real-world diversity of endoscopic scenarios and the full range of skills needed in clinical workflows. To address these issues, we introduce EndoBench, the first comprehensive benchmark specifically designed to assess MLLMs across the full spectrum of endoscopic practice with multi-dimensional capacities. EndoBench encompasses 4 distinct endoscopic scenarios, 12 specialized clinical tasks with 12 secondary subtasks, and 5 levels of visual prompting granularities, resulting in 6,832 rigorously validated VQA pairs from 21 diverse datasets. Our multi-dimensional evaluation framework mirrors the clinical workflow--spanning anatomical recognition, lesion analysis, spatial localization, and surgical operations--to holistically gauge the perceptual and diagnostic abilities of MLLMs in realistic scenarios. We benchmark 23 state-of-the-art models, including general-purpose, medical-specialized, and proprietary MLLMs, and establish human clinician performance as a reference standard. Our extensive experiments reveal: (1) proprietary MLLMs outperform open-source and medical-specialized models overall, but still trail human experts; (2) medical-domain supervised fine-tuning substantially boosts task-specific accuracy; and (3) model performance remains sensitive to prompt format and clinical task complexity. EndoBench establishes a new standard for evaluating and advancing MLLMs in endoscopy, highlighting both progress and persistent gaps between current models and expert clinical reasoning. We publicly release our benchmark and code.
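
As a rough illustration of how results on a benchmark of VQA pairs can be broken down by clinical task, the sketch below computes per-task accuracy. It assumes multiple-choice answers identified by an option letter, which the abstract does not state; the record keys (`task`, `answer`, `prediction`) and the function name `accuracy_by_task` are hypothetical and are not taken from the EndoBench code release.

```python
from collections import defaultdict


def accuracy_by_task(records):
    """Return {task: fraction of correct answers} for VQA records.

    Each record is a dict with the hypothetical keys 'task', 'answer'
    (ground-truth option letter) and 'prediction' (model's option letter).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["task"]] += 1
        # Compare case-insensitively so "d" and "D" count as the same option.
        if rec["prediction"].strip().upper() == rec["answer"].strip().upper():
            correct[rec["task"]] += 1
    return {task: correct[task] / total[task] for task in total}


# Toy records for illustration only; they are not drawn from EndoBench.
records = [
    {"task": "anatomical recognition", "answer": "A", "prediction": "A"},
    {"task": "anatomical recognition", "answer": "C", "prediction": "B"},
    {"task": "lesion analysis", "answer": "D", "prediction": "d"},
]
scores = accuracy_by_task(records)
```

Reporting one score per task, rather than a single pooled number, keeps weak performance on a small task from being hidden by strong performance on a large one.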

artificial intelligence, natural language, proceedings, (7 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Diagnostic Medicine (0.84)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.64)

MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology

Vasilev, Kiril, Misrahi, Alexandre, Jain, Eeshaan, Cheng, Phil F, Liakopoulos, Petros, Michielin, Olivier, Moor, Michael, Bunne, Charlotte

arXiv.org Artificial Intelligence, Nov-26-2025

Multimodal Large Language Models (LLMs) hold promise for biomedical reasoning, but current benchmarks fail to capture the complexity of real-world clinical workflows. Existing evaluations primarily assess unimodal, decontextualized question-answering, overlooking multi-agent decision-making environments such as Molecular Tumor Boards (MTBs). MTBs bring together diverse experts in oncology, where diagnostic and prognostic tasks require integrating heterogeneous data and evolving insights over time. Current benchmarks lack this longitudinal and multimodal complexity. We introduce MTBBench, an agentic benchmark simulating MTB-style decision-making through clinically challenging, multimodal, and longitudinal oncology questions. Ground truth annotations are validated by clinicians via a co-developed app, ensuring clinical relevance. We benchmark multiple open and closed-source LLMs and show that, even at scale, they lack reliability -- frequently hallucinating, struggling with reasoning from time-resolved data, and failing to reconcile conflicting evidence or different modalities. To address these limitations, MTBBench goes beyond benchmarking by providing an agentic framework with foundation model-based tools that enhance multi-modal and longitudinal reasoning, leading to task-level performance gains of up to 9.0% and 11.2%, respectively. Overall, MTBBench offers a challenging and realistic testbed for advancing multimodal LLM reasoning, reliability, and tool-use with a focus on MTB environments in precision oncology.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2511.2049

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Reflections from Research Roundtables at the Conference on Health, Inference, and Learning (CHIL) 2025

Alsentzer, Emily, Charpignon, Marie-Laure, Chen, Bill, D'Souza, Niharika, Fries, Jason, Jiang, Yixing, Kashyap, Aparajita, Kim, Chanwoo, Lee, Simon, Mandyam, Aishwarya, Mbilinyi, Ashery, Mehandru, Nikita, Nagesh, Nitish, Nuwagira, Brighton, Pierson, Emma, Pillai, Arvind, Sano, Akane, Syeda-Mahmood, Tanveer, Yadav, Shashank, Adhanom, Elias, Afza, Muhammad Umar, Archer, Amelia, Bedi, Suhana, Bikia, Vasiliki, Chang, Trenton, Chen, George H., Chen, Winston, Chiang, Erica, Choi, Edward, Ciora, Octavia, Dozie-Nnamah, Paz, Elsharief, Shaza, Engelhard, Matthew, Eshragh, Ali, Feng, Jean, Fessel, Josh, Fleming, Scott, Fong, Kei Sen, Frost, Thomas, Gadgil, Soham, Gichoya, Judy, Hershkovich, Leeor, Im, Sujeong, Jain, Bhavya, Jeanselme, Vincent, Jia, Furong, Jin, Qixuan, Jin, Yuxuan, Kapash, Daniel, Kapoor, Geetika, Kiafar, Behdokht, Kleiner, Matthias, Kraft, Stefan, Kumar, Annika, Kyung, Daeun, Liang, Zhongyuan, Lin, Joanna, Liu, Qianchu, Liu, Chang, Luan, Hongzhou, Lunt, Chris, López, Leopoldo Julían Lechuga, McDermott, Matthew B. A., Noroozizadeh, Shahriar, O'Brien, Connor, Oh, YongKyung, Ota, Mixail, Pfohl, Stephen, Pi, Meagan, Pias, Tanmoy Sarkar, Rocheteau, Emma, Sethi, Avishaan, Shirakawa, Toru, Silver, Anita, Simha, Neha, Stankeviciute, Kamile, Sunog, Max, Szolovits, Peter, Tang, Shengpu, Tang, Jialu, Tierney, Aaron, Valdovinos, John, Wallace, Byron, Wang, Will Ke, Washington, Peter, Weiss, Jeremy, Wolfe, Daniel, Wong, Emily, Yun, Hye Sun, Zhang, Xiaoman, Zhang, Xiao Yu Cindy, Jeong, Hayoung, Thakoor, Kaveri A.

arXiv.org Artificial Intelligence, Nov-5-2025

The 6th annual Conference on Health, Inference, and Learning (CHIL 2025), hosted by the Association for Health Learning and Inference (AHLI), was held in person on June 25-27, 2025, at the University of California, Berkeley, in Berkeley, California, USA. As part of this year's program, we hosted Research Roundtables to catalyze collaborative, small-group dialogue around critical, timely topics at the intersection of machine learning and healthcare. Each roundtable was moderated by a team of senior and junior chairs who fostered open exchange, intellectual curiosity, and inclusive engagement. The sessions emphasized rigorous discussion of key challenges, creative exploration of emerging opportunities, and collective ideation toward actionable directions in the field. Overall, the Research Roundtables brought together a diverse mix of participants, including academic researchers, clinicians, industry professionals, and policy experts. In total, eight roundtables were held across two 30-minute sessions, with a brief transition break to allow participants to join multiple discussions.

data mining, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.15217

Country: North America > United States > California > Alameda County > Berkeley (0.54)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)
Overview (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(7 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Translating Milli/Microrobots with A Value-Centered Readiness Framework

Ceylan, Hakan, Sinibaldi, Edoardo, Misra, Sanjay, Pasricha, Pankaj J., Hutmacher, Dietmar W.

arXiv.org Artificial Intelligence, Oct-15-2025

Untethered mobile milli/microrobots hold transformative potential for interventional medicine by enabling more precise and entirely non-invasive diagnosis and therapy. Realizing this promise requires bridging the gap between groundbreaking laboratory demonstrations and successful clinical integration. Despite remarkable technical progress over the past two decades, most millirobots and microrobots remain confined to laboratory proof-of-concept demonstrations, with limited real-world feasibility. In this Review, we identify key factors that slow translation from bench to bedside, focusing on the disconnect between technical innovation and real-world application. We argue that the long-term impact and sustainability of the field depend on aligning development with unmet medical needs, ensuring applied feasibility, and integrating seamlessly into existing clinical workflows, which are essential pillars for delivering meaningful patient outcomes. To support this shift, we introduce a strategic milli/microrobot Technology Readiness Level framework (mTRL), which maps system development from initial conceptualization to clinical adoption through clearly defined milestones and their associated stepwise activities. The mTRL model provides a structured gauge of technological maturity, a common language for cross-disciplinary collaboration and actionable guidance to accelerate translational development toward new, safer and more efficient interventions.

artificial intelligence, human computer interaction, microrobot, (15 more...)

arXiv.org Artificial Intelligence

2510.1209

Country:

Oceania > Australia > Queensland (0.15)
North America > United States > Arizona (0.14)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.46)

Whole-body Representation Learning For Competing Preclinical Disease Risk Assessment

Seletkov, Dmitrii, Starck, Sophie, Erdur, Ayhan Can, Zhang, Yundi, Rueckert, Daniel, Braren, Rickmer

arXiv.org Artificial Intelligence, Aug-5-2025

Reliable preclinical disease risk assessment is essential to move public healthcare from reactive treatment to proactive identification and prevention. However, image-based risk prediction algorithms often consider one condition at a time and depend on hand-crafted features obtained through segmentation tools. We propose a whole-body self-supervised representation learning method for preclinical disease risk assessment under competing-risk modeling. This approach outperforms whole-body radiomics in multiple diseases, including cardiovascular disease (CVD), type 2 diabetes (T2D), chronic obstructive pulmonary disease (COPD), and chronic kidney disease (CKD). When a preclinical screening scenario is simulated and the representations are subsequently combined with cardiac MRI, the method further sharpens the prediction for CVD subgroups: ischemic heart disease (IHD), hypertensive diseases (HD), and stroke. The results indicate the translational potential of whole-body representations as a standalone screening modality and as part of a multi-modal framework within clinical workflows for early personalized risk stratification.

artificial intelligence, machine learning, representation, (14 more...)

arXiv.org Artificial Intelligence

2508.02307

Country: Europe > Germany (0.47)

Genre: Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Security & Privacy (0.85)

MAIA: A Collaborative Medical AI Platform for Integrated Healthcare Innovation

Bendazzoli, Simone, Persson, Sanna, Astaraki, Mehdi, Pettersson, Sebastian, Grozman, Vitali, Moreno, Rodrigo

arXiv.org Artificial Intelligence, Jul-29-2025

Artificial Intelligence (AI) integration in healthcare has emerged as a transformative force, promising to revolutionize patient care, optimize resource allocation, and enhance clinical decision-making [2, 10]. As the healthcare ecosystem increasingly recognizes the importance of AI-powered tools, there is a growing need for collaborative platforms to facilitate the development, deployment, and management of AI solutions in medical settings [7, 13]. Modern healthcare institutions are facing complex challenges that demand sophisticated technological solutions. A comprehensive Medical AI Platform can serve as a powerful foundation for addressing these complex needs, effectively bridging technological capabilities with clinical requirements. One of the open challenges in healthcare is the management of the vast amounts of data handled in clinical settings. Cloud-based medical AI platforms can provide new opportunities for computational resource sharing, enabling institutions to optimize data storage and build collaborative research environments. By creating a unified and standardised ecosystem, these platforms break down traditional institutional barriers, facilitating knowledge exchange between medical professionals, data scientists, and researchers.

artificial intelligence, machine learning, platform, (19 more...)

arXiv.org Artificial Intelligence

2507.19489

Country: Europe > Sweden (0.16)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)

The case for delegated AI autonomy for Human AI teaming in healthcare

Jia, Yan, Evans, Harriet, Porter, Zoe, Graham, Simon, McDermid, John, Lawton, Tom, Snead, David, Habli, Ibrahim

arXiv.org Artificial Intelligence, Mar-24-2025

In this paper we propose an advanced approach to integrating artificial intelligence (AI) into healthcare: autonomous decision support. This approach allows the AI algorithm to act autonomously for a subset of patient cases whilst serving a supportive role in other subsets of patient cases based on defined delegation criteria. By leveraging the complementary strengths of both humans and AI, it aims to deliver greater overall performance than existing human-AI teaming models. It ensures safe handling of patient cases and potentially reduces clinician review time, whilst being mindful of AI tool limitations. After setting the approach within the context of current human-AI teaming models, we outline the delegation criteria and apply them to a specific AI-based tool used in histopathology. The potential impact of the approach and the regulatory requirements for its successful implementation are then discussed.
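
The idea of acting autonomously on some cases and supportively on others can be sketched as a simple routing rule. The example below is a minimal illustration under stated assumptions: the two criteria used here (a validated-scope flag and a confidence threshold) and all names (`CaseAssessment`, `route_case`, `confidence_threshold`) are hypothetical placeholders, not the delegation criteria defined in the paper.

```python
from dataclasses import dataclass


@dataclass
class CaseAssessment:
    case_id: str
    ai_label: str
    ai_confidence: float      # assumed to lie in [0, 1]
    in_validated_scope: bool  # assumed flag: case type the tool was validated on


def route_case(case, confidence_threshold=0.95):
    """Return 'autonomous' if the AI may act alone, else 'supportive'.

    Both conditions are illustrative stand-ins for delegation criteria:
    the case must be in scope and the confidence must reach the threshold.
    """
    if case.in_validated_scope and case.ai_confidence >= confidence_threshold:
        return "autonomous"
    # Every other case goes to a clinician, with the AI output as support.
    return "supportive"


confident = CaseAssessment("c1", "benign", 0.99, True)
uncertain = CaseAssessment("c2", "benign", 0.80, True)
out_of_scope = CaseAssessment("c3", "benign", 0.99, False)
routes = [route_case(c) for c in (confident, uncertain, out_of_scope)]
```

Defaulting to the supportive route means that any case failing a criterion, including one that is merely out of scope, still receives clinician review.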

artificial intelligence, delegation criteria, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.18778

Country:

North America > United States (0.28)
Europe > United Kingdom > England > West Midlands > Coventry (0.05)
Europe > United Kingdom > England > West Yorkshire > Bradford (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.95)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Government Relations & Public Policy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Approach to Designing CV Systems for Medical Applications: Data, Architecture and AI

Ryabtsev, Dmitry, Vasilyev, Boris, Shershakov, Sergey

arXiv.org Artificial Intelligence, Jan-24-2025

This paper introduces an innovative software system for fundus image analysis that deliberately diverges from the conventional screening approach, opting not to predict specific diagnoses. Instead, our methodology mimics the diagnostic process by thoroughly analyzing both normal and pathological features of fundus structures, leaving the ultimate decision-making authority in the hands of healthcare professionals. Our initiative addresses the need for objective clinical analysis and seeks to automate and enhance the clinical workflow of fundus image examination. The system, from its overarching architecture to the modular analysis design powered by artificial intelligence (AI) models, aligns seamlessly with ophthalmological practices. Our unique approach utilizes a combination of state-of-the-art deep learning methods and traditional computer vision algorithms to provide a comprehensive and nuanced analysis of fundus structures. We present a distinctive methodology for designing medical applications, using our system as an illustrative example. Comprehensive verification and validation results demonstrate the efficacy of our approach in revolutionizing fundus image analysis, with potential applications across various medical domains.

application, clinical workflow, clinician, (14 more...)

arXiv.org Artificial Intelligence

2501.14689

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
Europe > Netherlands > Utrecht (0.04)
Europe > Latvia > Riga Municipality > Riga (0.04)

Genre:

Research Report (0.70)
Workflow (0.52)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models

Lee, Chanseo, Kumar, Sonu, Vogt, Kimon A., Meraj, Sam, Vogt, Antonia

arXiv.org Artificial Intelligence, Nov-20-2024

The increasing demand for multilingual capabilities in healthcare underscores the need for AI models adept at processing diverse languages, particularly in clinical documentation and decision-making. Arabic, with its complex morphology, syntax, and diglossia, poses unique challenges for natural language processing (NLP) in medical contexts. This case study evaluates Sporo AraSum, a language model tailored for Arabic clinical documentation, against JAIS, the leading Arabic NLP model. Using synthetic datasets and PDQI-9 metrics that we modified to assess model performance in a different language, the study assessed the models' performance in summarizing patient-physician interactions, focusing on accuracy, comprehensiveness, clinical utility, and linguistic-cultural competence. Results indicate that Sporo AraSum significantly outperforms JAIS in AI-centric quantitative metrics and all qualitative attributes measured in our modified version of the PDQI-9. AraSum's architecture enables precise and culturally sensitive documentation, addressing the linguistic nuances of Arabic while mitigating risks of AI hallucinations. These findings suggest that Sporo AraSum is better suited to meet the demands of Arabic-speaking healthcare environments, offering a transformative solution for multilingual clinical workflows. Future research should incorporate real-world data to further validate these findings and explore broader integration into healthcare systems.

arabic, information, sporo arasum, (13 more...)

arXiv.org Artificial Intelligence

2411.13518

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

AI-driven View Guidance System in Intra-cardiac Echocardiography Imaging

Huh, Jaeyoung, Klein, Paul, Funka-Lea, Gareth, Sharma, Puneet, Kapoor, Ankur, Kim, Young-Ho

arXiv.org Artificial Intelligence, Sep-26-2024

Intra-cardiac Echocardiography (ICE) is a crucial imaging modality used in electrophysiology (EP) and structural heart disease (SHD) interventions, providing real-time, high-resolution views from within the heart. Despite its advantages, effective manipulation of the ICE catheter requires significant expertise, which can lead to inconsistent outcomes, particularly among less experienced operators. To address this challenge, we propose an AI-driven closed-loop view guidance system with human-in-the-loop feedback, designed to assist users in navigating ICE imaging without requiring specialized knowledge. Our method models the relative position and orientation vectors between arbitrary views and clinically defined ICE views in a spatial coordinate system, guiding users on how to manipulate the ICE catheter to transition from the current view to the desired view over time. The primary use cases of ICE imaging involve visualizing target anatomy, detecting and tracking therapeutic devices, and validating treatments in real time. ICE is a sophisticated imaging modality that offers real-time, high-resolution views from within the heart, making it an invaluable tool in both EP and SHD interventions. Operators must have significant expertise in interpreting anatomical views via ICE images and in skillfully maneuvering the ICE catheter using two knobs (anterior-posterior, right-left) as well as rotation and translation.
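
To make the notion of a relative position and orientation between two views concrete, the sketch below computes a translation vector and the angle between two orientation vectors. It is a geometric illustration only, assuming each view is described by a 3D position and a non-zero 3D direction vector; the function name `relative_pose` and the sample coordinates are hypothetical, and this is not the learned model described in the abstract.

```python
import math


def relative_pose(current_pos, current_dir, target_pos, target_dir):
    """Return (translation, angle_deg) from the current view to the target.

    Positions and directions are 3-element sequences in one shared
    coordinate system; directions are assumed to be non-zero vectors.
    """
    translation = tuple(t - c for c, t in zip(current_pos, target_pos))
    dot = sum(a * b for a, b in zip(current_dir, target_dir))
    norm = (math.sqrt(sum(a * a for a in current_dir))
            * math.sqrt(sum(b * b for b in target_dir)))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return translation, math.degrees(math.acos(cos_angle))


# Illustrative values only: move from the origin, facing +x,
# to a target view at (10, -5, 2) facing +y.
translation, angle = relative_pose(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
    (10.0, -5.0, 2.0), (0.0, 1.0, 0.0),
)
```

A guidance system would then have to turn such a displacement and angle into catheter manipulations, a mapping this sketch does not attempt.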

catheter, ice catheter, orientation, (15 more...)

arXiv.org Artificial Intelligence

2409.16898

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Africa > Middle East > Morocco > Marrakesh-Safi Region > Marrakesh (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
